Introduction to Finite-State Devices in Natural Language Processing
Author
Abstract
The theory of finite-state automata (FSA) is rich, and finite-state techniques have been used in a wide range of domains, such as switching theory, pattern matching, pattern recognition, speech processing, handwriting recognition, optical character recognition, encryption algorithms, data compression, indexing, and operating system analysis (Petri nets). In this chapter, we describe the basic notions of finite-state automata and finite-state transducers. We also describe the fundamental properties of these machines while illustrating their use. We give simple formal language examples as well as natural language examples. We also illustrate some of the main algorithms used with finite-state automata and transducers.
Finite-State Devices for Natural Language Processing, Roche and Schabes (Editors), MIT Press.
This work may not be copied or reproduced in whole or in part for any commercial purpose. Permission to copy in whole or in part without payment of fee is granted for nonprofit educational and research purposes provided that all such whole or partial copies include the following: a notice that such copying is by permission of Mitsubishi Electric Research Laboratories, Inc.; an acknowledgment of the authors and individual contributions to the work; and all applicable portions of the copyright notice. Copying, reproduction, or republishing for any other purpose shall require a license with payment of fee to Mitsubishi Electric Research Laboratories, Inc. All rights reserved.
Copyright © Mitsubishi Electric Research Laboratories, Inc., 1996, 201 Broadway, Cambridge, Massachusetts 02139
Similar resources
Strengths and weaknesses of finite-state technology: a case study in morphological grammar development
Finite-state technology is considered the preferred model for representing the phonology and morphology of natural languages. The attractiveness of this technology for natural language processing stems from four sources: modularity of the design, due to the closure properties of regular languages and relations; the compact representation that is achieved through minimization; efficiency, which ...
Statistics of Morphological Finite-State Transition Networks Obey the Power Law
Finite-state devices are widely used in natural language processing, yet little if anything is known about metrics and topology of finite-state transition graphs. Here we study numerically the structure of directed state transition graphs for several types of finite-state devices representing morphology of 16 languages. In all experiments we have found that distribution of incoming and outcomin...
1 Formal Language Theory
This chapter provides a gentle introduction to formal language theory, aimed at readers with little background in formal systems. The motivation is natural language processing (NLP), and the presentation is geared towards NLP applications, with linguistically motivated examples, but without compromising mathematical rigor. The text covers elementary formal language theory, including: regular la...
Finite-State Technology as a Programming Environment
Finite-state technology is considered the preferred model for representing the phonology and morphology of natural languages. The attractiveness of this technology for natural language processing stems from four sources: modularity of the design, due to the closure properties of regular languages and relations; the compact representation that is achieved through minimization; efficiency, which ...
Finite State Transducers with Predicates and Identities
An extension to finite state transducers is presented, in which atomic symbols are replaced by arbitrary predicates over symbols. The extension is motivated by applications in natural language processing (but may be more widely applicable) as well as by the observation that transducers with predicates generally have fewer states and fewer transitions. Although the extension is fairly trivial fo...
Publication date: 1996